1.
Biometrics ; 79(4): 3307-3318, 2023 12.
Article in English | MEDLINE | ID: mdl-37661821

ABSTRACT

For multivariate functional data, a functional latent factor model is proposed, extending the traditional latent factor model for multivariate data. The proposed model uses unobserved stochastic processes to induce the dependence among the different functions, and thus, for a large number of functions, may provide a more parsimonious and interpretable characterization of the otherwise complex dependencies between the functions. Sufficient conditions are provided to establish the identifiability of the proposed model. The performance of the proposed model is assessed through simulation studies and an application to electroencephalography data.


Subjects
Electroencephalography, Statistical Models, Computer Simulation, Stochastic Processes
2.
Stat Med ; 41(17): 3349-3364, 2022 07 30.
Article in English | MEDLINE | ID: mdl-35491388

ABSTRACT

We propose an inferential framework for fixed effects in longitudinal functional models and introduce tests for the correlation structures induced by the longitudinal sampling procedure. The framework provides a natural extension of standard longitudinal correlation models for scalar observations to functional observations. Using simulation studies, we compare fixed effects estimation under correctly and incorrectly specified correlation structures and also test the longitudinal correlation structure. Finally, we apply the proposed methods to a longitudinal functional dataset on physical activity. The computer code for the proposed method is available at https://github.com/rli20ST758/FILF.


Subjects
Exercise, Research Design, Computer Simulation, Humans, Longitudinal Studies
3.
Biometrics ; 78(2): 798-811, 2022 06.
Article in English | MEDLINE | ID: mdl-33594698

ABSTRACT

Soils have been heralded as a hidden resource that can be leveraged to mitigate and address some of the major global environmental challenges. Specifically, the organic carbon stored in soils, called soil organic carbon (SOC), can, through proper soil management, help offset fuel emissions, increase food productivity, and improve water quality. As collecting data on SOC is costly and time-consuming, not much data on SOC are available, although understanding the spatial variability in SOC is of fundamental importance for effective soil management. In this manuscript, we propose a modeling framework that can be used to gain a better understanding of the dependence structure of a spatial process by identifying regions within a spatial domain where the process displays the same spatial correlation range. To achieve this goal, we propose a generalization of the multiresolution approximation (M-RA) modeling framework of Katzfuss, originally introduced as a strategy to reduce the computational burden encountered when analyzing massive spatial datasets. To allow for the possibility that the correlation of a spatial process might be characterized by a different range in different subregions of a spatial domain, we provide the M-RA basis function weights with a two-component mixture prior, one of whose components is a shrinkage prior. We call our approach the mixture M-RA. Application of the mixture M-RA model to both stationary and nonstationary data shows that the mixture M-RA model can handle both types of data, can correctly establish the type of spatial dependence structure in the data (e.g., stationary versus not), and can identify regions of local stationarity.


Subjects
Carbon, Soil, Carbon/chemistry, Soil/chemistry, Spatial Analysis
4.
Stat Biopharm Res ; 13(3): 270-279, 2021.
Article in English | MEDLINE | ID: mdl-34790289

ABSTRACT

Longitudinal studies of rapid disease progression often rely on noisy biomarkers; the underlying longitudinal process naturally varies between subjects and within an individual subject over time; the process can have substantial memory in the form of within-subject correlation. Cystic fibrosis lung disease progression is measured by changes in a lung function marker (FEV1), such as a prolonged drop in lung function, clinically termed rapid decline. Choosing a longitudinal model that estimates rapid decline can be challenging, requiring covariate specifications to assess drug effect while balancing choices of covariance functions. Two classes of longitudinal models have recently been proposed: segmented and stochastic linear mixed effects (LMEs) models. With segmented LMEs, random changepoints are used to estimate the timing and degree of rapid decline, treating these points as structural breaks in the underlying longitudinal process. In contrast, stochastic LMEs, such as random walks, are locally linear but utilize continuously changing slopes, viewing bouts of rapid decline as localized, sharp changes. We compare commonly utilized variants of these approaches through an application using the Cystic Fibrosis Foundation Patient Registry. Changepoint modeling had the worst fit and predictive accuracy but certain covariance forms in stochastic LMEs produced problematic variance estimates.

5.
J Imaging ; 7(3)2021 Mar 03.
Article in English | MEDLINE | ID: mdl-34460701

ABSTRACT

This article describes an agricultural application of remote sensing methods. The idea is to aid in eradicating an invasive plant called Sosnowskyi borscht (H. sosnowskyi). These plants contain strong allergens and can induce burning skin pain, and may displace native plant species by overshadowing them, meaning that even solitary individuals must be controlled or destroyed in order to prevent damage to unused rural land and other neighbouring land of various types (mostly violated forest or housing areas). We describe several methods for detecting H. sosnowskyi plants from Sentinel-2A images, and verify our results. The workflow is based on recently improved technologies, which are used to pinpoint exact locations (small areas) of plants, allowing them to be found more efficiently than by visual inspection on foot or by car. The results are in the form of images that can be classified by several methods, and estimates of the cross-covariance or single-vector auto-covariance functions of the contaminant parameters are calculated from random functions composed of plant pixel vector data arrays. The correlation of the pixel vectors for H. sosnowskyi images depends on the density of the chlorophyll content in the plants. Estimates of the covariance functions were computed by varying the quantisation interval on a certain time scale and using a computer programme based on MATLAB. The correlation between the pixels of the H. sosnowskyi plants and other plants was found, possibly because their structures have sufficiently unique spectral signatures (pixel values) in raster images. H. sosnowskyi can be identified and confirmed using a combination of two classification methods (using supervised and unsupervised approaches). The reliability of this combined method was verified by applying the theory of covariance function, and the results showed that H. sosnowskyi plants had a higher correlation coefficient. 
This can be used to improve the results in order to get rid of plants in particular areas. Further experiments will be carried out to confirm these results based on in situ fieldwork, and to calculate the efficiency of our method.

6.
J Biomed Inform ; 117: 103698, 2021 05.
Article in English | MEDLINE | ID: mdl-33617985

ABSTRACT

Advances in the modeling and analysis of electronic health records (EHR) have the potential to improve patient risk stratification, leading to better patient outcomes. The modeling of complex temporal relations across the multiple clinical variables inherent in EHR data is largely unexplored. Existing approaches to modeling EHR data often lack the flexibility to handle time-varying correlations across multiple clinical variables, or they are too complex for clinical interpretation. Therefore, we propose a novel nonstationary multivariate Gaussian process model for EHR data to address the aforementioned drawbacks of existing methodologies. Our proposed model is able to capture time-varying scale, correlation and smoothness across multiple clinical variables. We also provide details on two inference approaches: maximum a posteriori and Hamiltonian Monte Carlo. Our model is validated on synthetic data, and we then demonstrate its effectiveness on EHR data from Kaiser Permanente Division of Research (KPDOR). Finally, we use the KPDOR EHR data to investigate the relationships between a clinical patient risk metric and the latent processes of our proposed model and demonstrate statistically significant correlations between these entities.


Subjects
Electronic Health Records, Humans, Normal Distribution
7.
Entropy (Basel) ; 22(10)2020 Sep 25.
Article in English | MEDLINE | ID: mdl-33286848

ABSTRACT

Based on the application of the conditional mean rule, a sampling-recovery algorithm is studied for a Gaussian two-dimensional process. The components of such a process are the input and output processes of an arbitrary linear system, which are characterized by their statistical relationships. Realizations are sampled in both processes, and the number and location of samples in the general case are arbitrary for each component. As a result, general expressions are found that determine the optimal structure of the recovery devices, as well as evaluate the quality of recovery of each component of the two-dimensional process. The main feature of the obtained algorithm is that the realizations of both components, or of one of them, are recovered based on two sets of samples related to the input and output processes. This means that the recovery involves not only the samples of the restored realization itself, but also the samples of the realization of the other component, which is statistically related to the first one. This type of general algorithm is characterized by a significantly improved recovery quality, as evidenced by the results of six non-trivial examples with different versions of the algorithms. The research method used and the proposed general algorithm for the reconstruction of multidimensional Gaussian processes have not been discussed in the literature.
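
As an illustration of the conditional mean rule mentioned in this abstract, the following is a minimal sketch for a single zero-mean Gaussian process, not the article's two-component algorithm. The function names, the exponential covariance, and the sample values are assumptions chosen for the example. In the two-component setting described above, one would presumably stack the samples of both processes and use their cross-covariances in the same formula.

```python
import numpy as np

def conditional_mean(t_new, t_obs, y_obs, cov):
    """E[x(t_new) | x(t_obs) = y_obs] for a zero-mean Gaussian process.

    `cov(s, t)` is a hypothetical covariance function that broadcasts
    over NumPy arrays.
    """
    K_oo = cov(t_obs[:, None], t_obs[None, :])  # covariance among samples
    K_no = cov(t_new[:, None], t_obs[None, :])  # covariance new vs. samples
    return K_no @ np.linalg.solve(K_oo, y_obs)

def exp_cov(s, t, scale=1.0):
    """Exponential covariance, assumed here purely for illustration."""
    return np.exp(-np.abs(s - t) / scale)

# Irregularly placed samples of one realization (made-up values).
t_obs = np.array([0.0, 1.0, 2.5])
y_obs = np.array([0.3, -0.1, 0.8])

# Recover the realization at sampled and unsampled times.
t_new = np.array([0.0, 0.5, 1.0, 2.5])
recovered = conditional_mean(t_new, t_obs, y_obs, exp_cov)
```

At the sampling times the recovered values coincide with the samples themselves, which is the usual interpolation property of the conditional mean when there is no measurement noise.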

8.
Article in English | MEDLINE | ID: mdl-34262756

ABSTRACT

Covariance estimation is essential yet underdeveloped for analyzing multivariate functional data. We propose a fast covariance estimation method for multivariate sparse functional data using bivariate penalized splines. The tensor-product B-spline formulation of the proposed method enables a simple spectral decomposition of the associated covariance operator and explicit expressions of the resulting eigenfunctions as linear combinations of B-spline bases, thereby dramatically facilitating subsequent principal component analysis. We derive a fast algorithm for selecting the smoothing parameters in covariance smoothing using leave-one-subject-out cross-validation. The method is evaluated with extensive numerical studies and applied to an Alzheimer's disease study with multiple longitudinal outcomes.

9.
Biometrika ; 106(2): 267-286, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31097832

ABSTRACT

We introduce methods for estimating the spectral density of a random field on a [Formula: see text]-dimensional lattice from incomplete gridded data. Data are iteratively imputed onto an expanded lattice according to a model with a periodic covariance function. The imputations are convenient computationally, in that circulant embedding and preconditioned conjugate gradient methods can produce imputations in [Formula: see text] time and [Formula: see text] memory. However, these so-called periodic imputations are motivated mainly by their ability to produce accurate spectral density estimates. In addition, we introduce a parametric filtering method that is designed to reduce periodogram smoothing bias. The paper contains theoretical results on properties of the imputed-data periodogram and numerical and simulation studies comparing the performance of the proposed methods to existing approaches in a number of scenarios. We present an application to a gridded satellite surface temperature dataset with missing values.
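
For orientation, the raw periodogram that such spectral density estimators build on can be sketched as below for a complete two-dimensional grid. This is only the basic building block under assumed conventions (mean-centering, normalization by the number of grid points); it does not implement the periodic imputation or the parametric filtering described in the abstract, and the function name is hypothetical.

```python
import numpy as np

def periodogram_2d(y):
    """Raw periodogram of a complete 2-D gridded field.

    The field is mean-centered and the squared modulus of its discrete
    Fourier transform is divided by the number of grid points.
    """
    y = np.asarray(y, dtype=float)
    centered = y - y.mean()
    return np.abs(np.fft.fft2(centered)) ** 2 / y.size

# Simulated white-noise field on an 8 x 12 lattice (illustrative data).
rng = np.random.default_rng(0)
field = rng.normal(size=(8, 12))
pgram = periodogram_2d(field)
```

With this normalization the periodogram ordinates sum to the total sum of squares of the centered field, by Parseval's identity.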

10.
J Am Stat Assoc ; 114(525): 344-357, 2019.
Article in English | MEDLINE | ID: mdl-31057192

ABSTRACT

The aim of this paper is to develop a novel class of functional structural equation models (FSEMs) for dissecting functional genetic and environmental effects on twin functional data, while characterizing the varying association between functional data and covariates of interest. We propose a three-stage estimation procedure to estimate varying coefficient functions for various covariates (e.g., gender) as well as three covariance operators for the genetic and environmental effects. We develop an inference procedure based on weighted likelihood ratio statistics to test the genetic/environmental effect at either a fixed location or a compact region. We also systematically carry out the theoretical analysis of the estimated varying functions, the weighted likelihood ratio statistics, and the estimated covariance operators. We conduct extensive Monte Carlo simulations to examine the finite-sample performance of the estimation and inference procedures. We apply the proposed FSEM to quantify the degree of genetic and environmental effects on twin white-matter tracts obtained from the UNC early brain development study.

11.
J Nonparametr Stat ; 31(4): 867-886, 2019.
Article in English | MEDLINE | ID: mdl-34393467

ABSTRACT

Improving estimation efficiency for regression coefficients is an important issue in the analysis of longitudinal data, which involves estimating the covariance matrix of errors. But challenges arise in estimating the covariance matrix of longitudinal data collected at irregular or unbalanced time points. In this paper, we develop a regularization method for estimating the covariance function and a stepwise procedure for estimating the parametric components efficiently in the varying-coefficient partially linear model. This procedure is also applicable to the varying-coefficient temporal mixed effects model. Our method utilizes the structure of the covariance function and thus has faster rates of convergence in estimating the covariance functions and outperforms the existing approaches in simulation studies. This procedure is easy to implement and its numerical performance is investigated using both simulated and real data.

12.
Comput Stat Data Anal ; 122: 101-114, 2018 Jun.
Article in English | MEDLINE | ID: mdl-29861518

ABSTRACT

A joint design for sampling functional data is proposed to achieve optimal prediction of both functional data and a scalar outcome. The motivating application is fetal growth, where the objective is to determine the optimal times to collect ultrasound measurements in order to recover fetal growth trajectories and to predict child birth outcomes. The joint design is formulated using an optimization criterion and implemented in a pilot study. Performance of the proposed design is evaluated via simulation study and application to fetal ultrasound data.

13.
Stat Med ; 37(8): 1376-1388, 2018 04 15.
Article in English | MEDLINE | ID: mdl-29230836

ABSTRACT

In many studies, it is of interest to predict the future trajectory of subjects based on their historical data, referred to as dynamic prediction. Mixed effects models have traditionally been used for dynamic prediction. However, the commonly used random intercept and slope model is often not sufficiently flexible for modeling subject-specific trajectories. In addition, there may be useful exposures/predictors of interest that are measured concurrently with the outcome, complicating dynamic prediction. To address these problems, we propose a dynamic functional concurrent regression model to handle the case where both the functional response and the functional predictors are irregularly measured. Currently, such a model cannot be fit by existing software. We apply the model to dynamically predict children's length conditional on prior length, weight, and baseline covariates. Inference on model parameters and subject-specific trajectories is conducted using the mixed effects representation of the proposed model. An extensive simulation study shows that the dynamic functional regression model provides more accurate estimation and inference than existing methods. Methods are supported by fast, flexible, open source software that uses heavily tested smoothing techniques.


Subjects
Forecasting/methods, Regression Analysis, Anthropometry, Body Height, Body Weight, Child Development, Preschool Child, Computer Simulation, Statistical Data Interpretation, Female, Growth, Growth Charts, Humans, Infant, Newborn Infant, Male, Peru
14.
Spat Spatiotemporal Epidemiol ; 18: 24-37, 2016 08.
Article in English | MEDLINE | ID: mdl-27494957

ABSTRACT

A problem often encountered in environmental epidemiological studies assessing the health effects associated with ambient exposure to air pollution is the spatial misalignment between monitors' locations and subjects' actual residential locations. Several strategies have been adopted to circumvent this problem and estimate pollutants concentrations at unsampled sites, including spatial statistical or geostatistical models that rely on the assumption of stationarity to model the spatial dependence in pollution levels. Although computationally convenient, the assumption of stationarity is often untenable for pollutants concentration, particularly in the near-road environment. Building upon the work of Fuentes (2001) and Schmidt et al. (2011), in this paper we present a non-stationary spatio-temporal model for three traffic-related pollutants in a localized near-road environment. Modeling each pollutant separately and independently, we express each pollutant's concentration as a mixture of two independent spatial processes, each equipped with a non-stationary covariance function with covariates driving the non-stationarity and the mixture weights.


Subjects
Air Pollutants/analysis, Air Pollution, Environmental Monitoring, Statistical Models, Vehicle Emissions/analysis, Cities, Humans, Michigan, Spatio-Temporal Analysis
15.
Bayesian Anal ; 11(3): 649-670, 2016 Sep.
Article in English | MEDLINE | ID: mdl-34457106

ABSTRACT

Functional data, with basic observational units being functions (e.g., curves, surfaces) varying over a continuum, are frequently encountered in various applications. While many statistical tools have been developed for functional data analysis, the issue of smoothing all functional observations simultaneously is less studied. Existing methods often focus on smoothing each individual function separately, at the risk of removing important systematic patterns common across functions. We propose a nonparametric Bayesian approach to smooth all functional observations simultaneously and nonparametrically. In the proposed approach, we assume that the functional observations are independent Gaussian processes subject to a common level of measurement errors, enabling the borrowing of strength across all observations. Unlike most Gaussian process regression models that rely on pre-specified structures for the covariance kernel, we adopt a hierarchical framework by assuming a Gaussian process prior for the mean function and an Inverse-Wishart process prior for the covariance function. These prior assumptions induce an automatic mean-covariance estimation in the posterior inference in addition to the simultaneous smoothing of all observations. Such a hierarchical framework is flexible enough to incorporate functional data with different characteristics, including data measured on either common or uncommon grids, and data with either stationary or nonstationary covariance structures. Simulations and real data analysis demonstrate that, in comparison with alternative methods, the proposed Bayesian approach achieves better smoothing accuracy and comparable mean-covariance estimation results. Furthermore, it can successfully retain the systematic patterns in the functional observations that are usually neglected by the existing functional data analyses based on individual-curve smoothing.

16.
Lifetime Data Anal ; 22(4): 504-30, 2016 10.
Article in English | MEDLINE | ID: mdl-26468013

ABSTRACT

In the analysis of censored survival data, simultaneous confidence bands are useful devices to help determine the efficacy of a treatment over a control. Semiparametric confidence bands are developed for the difference of two survival curves using empirical likelihood and compared with their nonparametric counterparts. Simulation studies are presented to show that the proposed semiparametric approach is superior, with the new confidence bands giving empirical coverage closer to the nominal level. Further comparisons reveal that the semiparametric confidence bands are tighter and, hence, more informative. For censoring rates between 10% and 40%, the semiparametric confidence bands provide a relative reduction in enclosed area of between 2% and 10% over the nonparametric bands, with a greater reduction attained for higher censoring rates. The methods are illustrated using a University of Massachusetts AIDS data set.


Subjects
Probability, Survival Analysis, Humans
17.
Stat Med ; 34(12): 2004-18, 2015 May 30.
Article in English | MEDLINE | ID: mdl-25762065

ABSTRACT

In long-term follow-up studies, irregular longitudinal data are observed when individuals are assessed repeatedly over time but at uncommon and irregularly spaced time points. Modeling the covariance structure for this type of data is challenging, as it requires specification of a covariance function that is positive definite. Moreover, in certain settings, careful modeling of the covariance structure for irregular longitudinal data can be crucial in order to ensure no bias arises in the mean structure. Two common settings where this occurs are studies with 'outcome-dependent follow-up' and studies with 'ignorable missing data'. 'Outcome-dependent follow-up' occurs when individuals with a history of poor health outcomes had more follow-up measurements, and the intervals between the repeated measurements were shorter. When the follow-up time process only depends on previous outcomes, likelihood-based methods can still provide consistent estimates of the regression parameters, given that both the mean and covariance structures of the irregular longitudinal data are correctly specified and no model for the follow-up time process is required. For 'ignorable missing data', the missing data mechanism does not need to be specified, but valid likelihood-based inference requires correct specification of the covariance structure. In both cases, flexible modeling approaches for the covariance structure are essential. In this paper, we develop a flexible approach to modeling the covariance structure for irregular continuous longitudinal data using the partial autocorrelation function and the variance function. In particular, we propose semiparametric non-stationary partial autocorrelation function models, which do not suffer from complex positive definiteness restrictions like the autocorrelation function. We describe a Bayesian approach, discuss computational issues, and apply the proposed methods to CD4 count data from a pediatric AIDS clinical trial.
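
The appeal of partial autocorrelations noted in this abstract is that each one can vary freely in (-1, 1) while the implied correlation matrix remains positive definite. The sketch below illustrates this with the standard recursion that maps partial autocorrelations to marginal correlations; it is an assumed, generic construction for illustration, not the semiparametric Bayesian model of the article, and the function name is hypothetical.

```python
import numpy as np

def pacf_to_corr(pacf):
    """Map a matrix of partial autocorrelations to a correlation matrix.

    pacf[j, i] (for j < i) is the correlation between measurements j and i
    after adjusting for the measurements in between. Any values in (-1, 1)
    yield a positive definite result.
    """
    pacf = np.asarray(pacf, dtype=float)
    n = pacf.shape[0]
    R = np.eye(n)
    for k in range(1, n):            # lag between the two measurements
        for j in range(n - k):
            i = j + k
            if k == 1:
                rho = pacf[j, i]     # adjacent pairs: nothing to adjust for
            else:
                mid = slice(j + 1, i)              # intervening measurements
                r1, r3, R2 = R[j, mid], R[i, mid], R[mid, mid]
                a = np.linalg.solve(R2, r1)
                b = np.linalg.solve(R2, r3)
                rho = r1 @ b + pacf[j, i] * np.sqrt((1 - r1 @ a) * (1 - r3 @ b))
            R[j, i] = R[i, j] = rho
    return R

# Lag-1 partial autocorrelations of 0.5 and zero at higher lags
# give a first-order autoregressive correlation structure.
n = 4
pacf = np.zeros((n, n))
for j in range(n - 1):
    pacf[j, j + 1] = 0.5
R_ar1 = pacf_to_corr(pacf)
```

In this example the resulting correlation between measurements j and i equals 0.5 raised to the power |i - j|, and replacing the entries of `pacf` by any other values strictly between -1 and 1 still produces a valid correlation matrix.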


Subjects
Bayes Theorem, Statistical Data Interpretation, Longitudinal Studies, Research Design, Analysis of Variance, Anti-HIV Agents/administration & dosage, Bias, CD4 Lymphocyte Count, Child, Computer Simulation, Drug Dose-Response Relationship, HIV Infections/drug therapy, HIV Infections/immunology, Humans, Likelihood Functions, Linear Models, Randomized Controlled Trials as Topic/statistics & numerical data, Zidovudine/administration & dosage
18.
J Microsc ; 258(2): 87-104, 2015 May.
Article in English | MEDLINE | ID: mdl-25689129

ABSTRACT

In the context of automated analyses of electron-backscattered-diffraction images, we present in this paper a novel method to automatically extract morphological properties of prior austenitic grains in martensitic steels based on raw crystallographic orientation maps. This quantification includes the estimation of the mean chord length in specific directions and, in addition, the reconstruction of the mean shape of austenitic grains inducing anisotropic shape properties. The approach is based on the morphological measure of covariance on a decision curve of grain fidelity per disorientation angle. These efforts have been motivated by the need for realistic microstructures to perform micromechanical studies of damage phenomena localized at grain boundaries in steels, one example being the type IV fracture phenomenon occurring in welded joints of grade P91/P92 steel. This failure is attributed to a change of the microstructure due to thermal gradients arising during the welding process. To precisely capture the relationships between microstructural changes and the localization of mechanical fields in a polycrystalline aggregate, we first need to achieve a reasonable stochastic model of its microstructure, which relies on a detailed knowledge of the microstructural morphology. As martensitic steels possess multiscale microstructures composed of prior austenitic grains, packets and laths, a relevant modelling strategy has to be proposed to account for the observed hierarchies. With this objective, this paper focuses on the larger scale entities present in the microstructure, namely, the austenitic grains.

19.
J Dairy Sci ; 98(2): 1296-309, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25434332

ABSTRACT

Three random regression models were developed for routine genetic evaluation of Danish, Finnish, and Swedish dairy cattle. Data included over 169 million test-day records with milk, protein, and fat yield observations from over 8.7 million dairy cows of all breeds. Variance component analyses showed significant differences in estimates between Holstein, Nordic Red Cattle, and Jersey, but only small to moderate differences within a breed across countries. The obtained variance component estimates were used to build, for each breed, their own set of covariance functions. The covariance functions describe the animal effects on milk, protein, and fat yields of the first 3 lactations as 9 different traits, assuming the same heritabilities and a genetic correlation of unity across countries. Only 15, 27, and 7 eigenfunctions with the largest eigenvalues were used to describe additive genetic animal effects and nonhereditary animal effects across lactations and within later lactations, respectively. These reduced-rank covariance functions explained 99.0 to 99.9% of the original variances but reduced the number of animal equations to be solved by 44%. Moderate rank reduction for nonhereditary animal effects and use of one-third-smaller measurement error correlations than obtained from variance component estimation made the models more robust against extreme observations. Estimation of the genetic levels of the countries' subpopulations within a breed was found sensitive to the way the breed effects were modeled, especially for the genetically heterogeneous Nordic Red Cattle. Means to ensure that only additive genetic effects entered the estimated breeding values were to describe the crossbreeding effects by fixed and random cofactors and the calving age effect by an age × breed proportion interaction, and to model phantom parent groups as random effects. 
To ensure that genetic variances were the same across the 3 countries in breeding value estimation, as suggested by the variance component estimates, the applied multiplicative heterogeneous variance adjustment method had to be tailored using country-specific reference measurement error variances. Results showed the feasibility of across-country genetic evaluation of cows and sires based on original test-day phenotypes. Nevertheless, applying a thorough model validation procedure is essential throughout the model building process to obtain reliable breeding values.
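
The rank reduction described in this abstract, keeping only the eigenfunctions with the largest eigenvalues, can be illustrated on a small covariance matrix as follows. This is a minimal sketch with an assumed toy 9 x 9 covariance and a hypothetical function name; it is not the evaluation model or the estimates reported in the article.

```python
import numpy as np

def reduce_rank(cov, rank):
    """Keep the `rank` leading eigenpairs of a symmetric covariance matrix.

    Returns the reduced-rank covariance and the share of total variance
    (trace) that it retains.
    """
    vals, vecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:rank]      # indices of the largest ones
    vals_k, vecs_k = vals[order], vecs[:, order]
    reduced = (vecs_k * vals_k) @ vecs_k.T     # V diag(lambda) V^T
    explained = vals_k.sum() / vals.sum()
    return reduced, explained

# Toy covariance among 9 traits with smoothly decaying correlation
# (assumed values, for illustration only).
idx = np.arange(9)
cov = np.exp(-np.abs(idx[:, None] - idx[None, :]) / 4.0)
reduced, explained = reduce_rank(cov, rank=3)
```

The value of `explained` plays the role of the "percentage of original variance explained" quoted in the abstract, and the lower rank reduces the number of random effects that need to be solved for per animal.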


Subjects
Cattle/genetics, Lactation/genetics, Milk/chemistry, Statistical Models, Algorithms, Analysis of Variance, Animals, Breeding, Fats/analysis, Female, Genetic Heterogeneity, Genetic Variation, Hybrid Vigor, Genetic Hybridization, Milk Proteins/analysis, Milk Proteins/genetics, Phenotype, Regression Analysis, Research, Species Specificity
20.
Scand Stat Theory Appl ; 40(1): 119-137, 2013 Mar 01.
Article in English | MEDLINE | ID: mdl-23599558

ABSTRACT

Spatial Cox point processes are a natural framework for quantifying the various sources of variation governing the spatial distribution of rain forest trees. We introduce a general criterion for variance decomposition for spatial Cox processes and apply it to specific Cox process models with additive or log-linear random intensity functions. Moreover, we consider a new and flexible class of pair correlation function models given in terms of normal variance mixture covariance functions. The proposed methodology is applied to point pattern data sets of locations of tropical rain forest trees.
